Macro-cuboïd based probabilistic matching for lip-reading digits
نویسندگان
چکیده
In this paper, we present a spatio-temporal feature representation and a probabilistic matching function to recognise lip movements from pronounced digits. Our model (1) automatically selects spatio-temporal features extracted from 10 digit model templates and (2) matches them with probe video sequences. Spatio-temporal features embed lip movements from pronouncing digits and contain more discriminative information than spatial features alone. A model template for each digit is represented by a set of spatiotemporal features at multiple scales. A probabilistic sequence matching function automatically segments a probe video sequence and matches the most likely sequence of digits recognised in the probe sequence. We demonstrate the proposed approach using the CUAVE [23] database and compare our representational scheme with three alternative methods, based on optical flow, intensity gradient and block matching, respectively. The evaluation shows that the proposed approach outperforms the others in recognition accuracy and is robust in coping with variations in probe sequences.
منابع مشابه
Geometrical-based lip-reading using template probabilistic multi-dimension dynamic time warping
By identifying lip movements and characterizing their associations with speech sounds, the performance of speech recognition systems can be improved, particularly when operating in noisy environments. In this paper, we present a geometrical-based automatic lip reading system that extracts the lip region from images using conventional techniques, but the contour itself is extracted using a novel...
متن کاملImplicational Scaling of Reading Comprehension Construct: Is it Deterministic or Probabilistic?
In English as a Second Language Teaching and Testing situations, it is common to infer about learners’ reading ability based on his or her total score on a reading test. This assumes the unidimensional and reproducible nature of reading items. However, few researches have been conducted to probe the issue through psychometric analyses. In the present study, the IELTS exemplar module C (1994) wa...
متن کاملPersian Handwritten Digit Recognition Using Particle Swarm Probabilistic Neural Network
Handwritten digit recognition can be categorized as a classification problem. Probabilistic Neural Network (PNN) is one of the most effective and useful classifiers, which works based on Bayesian rule. In this paper, in order to recognize Persian (Farsi) handwritten digit recognition, a combination of intelligent clustering method and PNN has been utilized. Hoda database, which includes 80000 P...
متن کاملEfficient face model for lip reading
There is number of researches on the lip reading. However, there is little discussion about which face model is effect for lip reading. This paper builds various face models which changes the combination of a face part, and changes the feature points. Various experiments were conducted on the conditions which change only model and do not change other algorithms. We apply the active appearance m...
متن کاملTo build a model for implementing automated lip reading which involves Lip motion feature to text conversion
A speech recognition system has three major components: feature extraction, probabilistic modelling of features and classification. In literature, the general approach is to extract the principle components of the lip movement in terms of the lip shape based properties in order to establish a one-to-one correspondence between phonemes of speech and visemes of lip shape. Several modelling and cl...
متن کامل